Running MSAF

The main MSAF functionality is demonstrated here.


In [ ]:
from __future__ import print_function
import msaf
import librosa
import seaborn as sns

# and IPython.display for audio output
import IPython.display

# Setup nice plots
sns.set(style="dark")
%matplotlib inline

Single File Mode

This mode analyzes one audio file at a time.


In [ ]:
# Choose an audio file and listen to it
audio_file = "../datasets/Sargon/audio/01-Sargon-Mindless.mp3"
IPython.display.Audio(filename=audio_file)

In [33]:
# Segment the file using the default MSAF parameters
boundaries, labels = msaf.process(audio_file)
print(boundaries)


[   0.            7.89478458   32.78657596   46.02195011   79.18004535
   90.65070295  112.70965986  129.42802721  142.94204082  154.41269841
  168.06603175  181.02276644  193.00426304  208.09723356  221.37904762
  234.15002268  288.48471655  317.04526077  330.00199546  344.90920635
  377.04562358  390.46675737  403.56281179  409.089161    414.14530612]
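
The array above holds boundary times in seconds, so each pair of consecutive values delimits one segment. As a minimal sketch in plain Python, the snippet below turns boundaries into (start, end) intervals and durations. It uses only the first five values copied from the output above, and the names `intervals` and `durations` are illustrative, not part of MSAF.

```python
# First five boundary times (seconds), copied from the output above
boundaries = [0.0, 7.89478458, 32.78657596, 46.02195011, 79.18004535]

# Consecutive boundary pairs delimit one segment each
intervals = list(zip(boundaries[:-1], boundaries[1:]))
durations = [end - start for start, end in intervals]

for (start, end), dur in zip(intervals, durations):
    print("%7.2f -> %7.2f  (%.2f s)" % (start, end, dur))
```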

In [ ]:
# Sonify boundaries
sonified_file = "my_boundaries.wav"
sr = 44100
boundaries, labels = msaf.process(audio_file, sonify_bounds=True, 
                                  out_bounds=sonified_file, out_sr=sr)

# Listen to results
audio = librosa.load(sonified_file, sr=sr)[0]
IPython.display.Audio(audio, rate=sr)
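
To give an idea of what sonifying boundaries means, here is a rough sketch that places a short decaying tone at each boundary time. This is not MSAF's implementation: the function `click_track` and its parameters (`click_len`, `freq`) are hypothetical, and the low sample rate and toy boundaries are chosen only to keep the pure-Python example fast.

```python
import math

def click_track(boundaries, duration, sr=8000, click_len=0.05, freq=1000.0):
    """Return a list of samples with a short decaying tone at each boundary.

    Hypothetical helper for illustration only; not part of MSAF.
    """
    n_samples = int(round(duration * sr))
    signal = [0.0] * n_samples
    n_click = int(click_len * sr)
    for t in boundaries:
        start = int(round(t * sr))
        for i in range(n_click):
            idx = start + i
            if idx >= n_samples:
                break
            # Exponentially decaying sine burst
            decay = math.exp(-5.0 * i / n_click)
            signal[idx] += decay * math.sin(2.0 * math.pi * freq * i / sr)
    return signal

# Toy boundaries (seconds) in a 1.5-second signal
clicks = click_track([0.0, 0.5, 1.2], duration=1.5)
print(len(clicks))
```

In practice the clicks would be mixed with the original audio before listening. librosa also provides `librosa.clicks` for generating click tracks from a list of times.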

Using Different Algorithms

MSAF includes multiple algorithms both for boundary retrieval and structural grouping (or labeling). In this section we demonstrate how to try them out.

Note: more algorithms are available in msaf-gpl.


In [34]:
# First, let's list all the available boundary algorithms
print(msaf.get_all_boundary_algorithms())


['cnmf', 'foote', 'olda', 'scluster', 'sf']

In [35]:
# Try one of these boundary algorithms and plot the results
boundaries, labels = msaf.process(audio_file, boundaries_id="foote", plot=True)



In [36]:
# Let's check all the structural grouping (label) algorithms available
print(msaf.get_all_label_algorithms())


['cnmf', 'fmc2d', 'scluster']

In [37]:
# Try one of these label algorithms
boundaries, labels = msaf.process(audio_file, boundaries_id="foote", labels_id="fmc2d")
print(boundaries)
print(labels)


[   0.           22.01251701   33.20453515   46.48634921   91.06866213
  113.12761905  129.84598639  168.43755102  190.58938776  220.49668934
  236.51845805  264.01088435  286.85931973  302.64888889  318.6706576
  343.51600907  369.94031746  380.43573696  409.089161    414.14530612]
[4, 4, 4, 3, 4, 4, 2, 4, 0, 4, 4, 1, 4, 4, 0, 5, 4, 0, 6]
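
In the output above there are 20 boundaries and 19 labels, one label per segment. The following sketch, in plain Python with the values copied from that output, pairs each segment with its label and sums the time assigned to each label. The names `segments` and `time_per_label` are illustrative, not part of MSAF.

```python
# Boundaries (seconds) and labels copied from the output above
boundaries = [0.0, 22.01251701, 33.20453515, 46.48634921, 91.06866213,
              113.12761905, 129.84598639, 168.43755102, 190.58938776,
              220.49668934, 236.51845805, 264.01088435, 286.85931973,
              302.64888889, 318.6706576, 343.51600907, 369.94031746,
              380.43573696, 409.089161, 414.14530612]
labels = [4, 4, 4, 3, 4, 4, 2, 4, 0, 4, 4, 1, 4, 4, 0, 5, 4, 0, 6]

# N boundaries delimit N - 1 segments, one label per segment
assert len(labels) == len(boundaries) - 1

# (start, end, label) triplets
segments = list(zip(boundaries[:-1], boundaries[1:], labels))

# Total time covered by each label
time_per_label = {}
for start, end, label in segments:
    time_per_label[label] = time_per_label.get(label, 0.0) + (end - start)

for label in sorted(time_per_label):
    print("label %d: %.1f s" % (label, time_per_label[label]))
```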

In [38]:
# If available, you can use previously annotated boundaries and a specific labels algorithm
# Set plot = True to plot the results
boundaries, labels = msaf.process(audio_file, boundaries_id="gt", 
                                  labels_id="scluster", plot=True)


Using Different Features

Some algorithms accept different types of features (e.g., harmonic, timbral). In this section we show how to pass different features to MSAF.


In [39]:
# Let's check which features are available in MSAF
print(msaf.AVAILABLE_FEATS)


['hpcp', 'mfcc', 'cqt', 'tonnetz']

In [40]:
# Segment the file using the Foote method for boundaries, C-NMF method for labels, and MFCC features
boundaries, labels = msaf.process(audio_file, feature="mfcc", boundaries_id="foote", 
                                  labels_id="cnmf", plot=True)


Evaluate Results

The results can be evaluated as long as there is an existing file containing reference annotations. The results are stored in a pandas DataFrame. MSAF has to run these algorithms (using msaf.process described above) before being able to evaluate its results.


In [41]:
# Evaluate the results. It returns a pandas data frame.
evaluations = msaf.eval.process(audio_file, boundaries_id="foote", labels_id="fmc2d")
IPython.display.display(evaluations)


D DevE2R DevR2E DevtE2R DevtR2E HitRate_0.5F HitRate_0.5P HitRate_0.5R HitRate_3F HitRate_3P ... HitRate_t3P HitRate_t3R PWF PWP PWR Sf So Su ds_name track_id
0 0.470375 1.71449 1.896168 2.401961 2.909297 0.325581 0.35 0.304348 0.604651 0.65 ... 0.611111 0.52381 0.420751 0.290074 0.76569 0.641772 0.807684 0.532406 01-Sargon-Mindless.jams 01-Sargon-Mindless

1 rows × 25 columns
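
The `HitRate_*` columns come in triplets of precision (P), recall (R) and F-measure (F). Assuming the F column is the standard harmonic mean of P and R, which is consistent with the numbers shown, it can be recomputed from the other two. The helper `f_measure` below is a sketch for illustration, not an MSAF function.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# HitRate_0.5P and HitRate_0.5R from the row above (0.5-second window)
p, r = 0.35, 0.304348
f = f_measure(p, r)

# Compare against the reported HitRate_0.5F value
print(abs(f - 0.325581) < 1e-5)
```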

Explore Algorithm Parameters

Now let's modify the configuration of one of the algorithms to see how the results change. We will use IPython widgets, which come in handy here.


In [42]:
# First, check the parameters of the Foote algorithm:
print(msaf.algorithms.foote.config)


{'L_peaks': 64, 'm_median': 12, 'M_gaussian': 96}

In [43]:
# Play around with the interact widget (it lives in ipywidgets since IPython 4;
# older versions exposed it as IPython.html.widgets)
from ipywidgets import interact

# Obtain the default configuration
bid = "foote"  # Boundaries ID
lid = None     # Labels ID
feature = "hpcp"
config = msaf.io.get_configuration(feature, annot_beats=False, framesync=False, 
                                   boundaries_id=bid, labels_id=lid)

# Sweep the M_gaussian parameter
@interact(M_gaussian=(50, 500, 25))
def _run_msaf(M_gaussian):
    # Set the configuration
    config["M_gaussian"] = M_gaussian
    
    # Segment the file using the Foote method, and Pitch Class Profiles for the features
    results = msaf.process(audio_file, feature=feature, boundaries_id=bid, 
                           config=config, plot=True)

    # Evaluate the results. It returns a pandas data frame.
    evaluations = msaf.eval.process(audio_file, feature=feature, boundaries_id=bid,
                                    config=config)
    IPython.display.display(evaluations)


D DevE2R DevR2E DevtE2R DevtR2E HitRate_0.5F HitRate_0.5P HitRate_0.5R HitRate_3F HitRate_3P HitRate_3R HitRate_t0.5F HitRate_t0.5P HitRate_t0.5R HitRate_t3F HitRate_t3P HitRate_t3R ds_name track_id
0 0.495857 1.114853 1.266984 1.266984 1.739093 0.380952 0.421053 0.347826 0.761905 0.842105 0.695652 0.315789 0.352941 0.285714 0.736842 0.823529 0.666667 01-Sargon-Mindless.jams 01-Sargon-Mindless
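
The same sweep can be run without widgets, for example to collect results in a script. The sketch below shows only the loop structure: `sweep` and `fake_run` are hypothetical names, and `fake_run` returns an arbitrary placeholder value so that the example runs without audio. In the notebook, the callable would set `config["M_gaussian"]`, call `msaf.process` as above and return whatever quantity is of interest.

```python
def sweep(values, run):
    """Call run(value) for each value and collect (value, result) pairs."""
    return [(value, run(value)) for value in values]

# Stand-in for the real work (arbitrary placeholder, not an MSAF result)
def fake_run(m_gaussian):
    return 1000 // m_gaussian

results = sweep(range(50, 501, 150), fake_run)
for m_gaussian, value in results:
    print(m_gaussian, value)
```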

Collection Mode

MSAF can process and evaluate multiple files in parallel. In this section we demonstrate this functionality.


In [44]:
dataset = "../datasets/Sargon/"
results = msaf.process(dataset, n_jobs=4, boundaries_id="foote")

In [45]:
# Evaluate in collection mode
evaluations = msaf.eval.process(dataset, n_jobs=4, boundaries_id="foote")
IPython.display.display(evaluations)


D DevE2R DevR2E DevtE2R DevtR2E HitRate_0.5F HitRate_0.5P HitRate_0.5R HitRate_3F HitRate_3P HitRate_3R HitRate_t0.5F HitRate_t0.5P HitRate_t0.5R HitRate_t3F HitRate_t3P HitRate_t3R ds_name track_id
0 0.470375 1.714490 1.896168 2.401961 2.909297 0.325581 0.35000 0.304348 0.604651 0.650000 0.565217 0.256410 0.277778 0.238095 0.564103 0.611111 0.523810 01-Sargon-Mindless.jams 01-Sargon-Mindless
1 0.437517 1.313379 1.618277 1.535261 1.724082 0.222222 0.25000 0.200000 0.800000 0.900000 0.720000 0.146341 0.166667 0.130435 0.780488 0.888889 0.695652 02-Sargon-Shattered World.jams 02-Sargon-Shattered World
2 0.448502 0.504501 12.046519 0.868719 12.243039 0.263158 0.50000 0.178571 0.473684 0.900000 0.321429 0.176471 0.375000 0.115385 0.411765 0.875000 0.269231 03-Sargon-Waiting For Silence.jams 03-Sargon-Waiting For Silence
3 0.318680 1.046576 1.175283 1.064331 1.271236 0.256881 0.27451 0.241379 0.660550 0.705882 0.620690 0.228571 0.244898 0.214286 0.647619 0.693878 0.607143 04-Sargon-The Curse Of Akkad.jams 04-Sargon-The Curse Of Akkad
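
With one row per track, the results are easy to summarize across the collection. As a minimal sketch in plain Python, the snippet below averages the `HitRate_3F` column using the four values copied from the table above. The names `hit_rate_3f`, `average` and `best` are illustrative.

```python
from statistics import mean

# HitRate_3F per track, copied from the table above
hit_rate_3f = {
    "01-Sargon-Mindless": 0.604651,
    "02-Sargon-Shattered World": 0.800000,
    "03-Sargon-Waiting For Silence": 0.473684,
    "04-Sargon-The Curse Of Akkad": 0.660550,
}

average = mean(list(hit_rate_3f.values()))
best = max(hit_rate_3f, key=hit_rate_3f.get)

print("mean HitRate_3F: %.4f" % average)
print("best track: %s" % best)
```

Since `evaluations` is a pandas DataFrame, `evaluations.mean(numeric_only=True)` should give the corresponding average for every numeric column at once.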